Clustering WordNet word senses

Authors

  • Eneko Agirre
  • Oier Lopez de Lacalle
Abstract

This paper presents the results of a set of methods to cluster WordNet word senses. The methods rely on different information sources: confusion matrices from Senseval-2 Word Sense Disambiguation systems, translation similarities, hand-tagged examples of the target word senses, and examples of the target word senses obtained automatically from the web. The clustering results were evaluated against the coarse-grained word senses provided for the lexical sample in Senseval-2. We used Cluto, a general clustering environment, to test different clustering algorithms. The best results come from the automatically obtained examples, which yield purity values of up to 84% on average over 20 nouns.
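The purity figure reported above can be illustrated with a small sketch. It assumes the conventional definition of purity (each cluster is credited with its most frequent gold class, and the credited items are divided by the total number of items); the abstract does not spell out the exact formula used, so this is an illustration under that assumption. The function name `purity` and the toy data are hypothetical and not taken from the paper.

```python
from collections import Counter


def purity(cluster_ids, gold_labels):
    """Conventional clustering purity (assumed definition, not from the paper).

    cluster_ids[i] is the cluster assigned to item i (e.g. a fine-grained sense);
    gold_labels[i] is its gold coarse-grained class.
    """
    if len(cluster_ids) != len(gold_labels):
        raise ValueError("cluster_ids and gold_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one item is required")

    # Count gold classes inside each cluster.
    per_cluster = {}
    for cluster, gold in zip(cluster_ids, gold_labels):
        per_cluster.setdefault(cluster, Counter())[gold] += 1

    # Credit each cluster with the size of its majority gold class.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: five senses of one noun, two induced clusters,
# and two gold coarse-grained groups "A" and "B".
clusters = [0, 0, 1, 1, 1]
gold = ["A", "A", "A", "B", "B"]
print(purity(clusters, gold))  # 0.8
```

Purity alone rewards many small clusters (one cluster per sense always scores 1.0), so it is usually read together with the number of clusters produced.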

Similar resources

Disambiguating Noun Groupings with Respect to Wordnet Senses

Word groupings useful for language processing tasks are increasingly available, as thesauri appear online, and as distributional word clustering techniques improve. However, for many tasks, one is interested in relationships among word senses, not words. This paper presents a method for automatic sense disambiguation of nouns appearing within sets of related nouns — the kind of data one finds i...


Improving Distributed Representation of Word Sense via WordNet Gloss Composition and Context Clustering

In recent years, there has been an increasing interest in learning a distributed representation of word sense. Traditional context clustering based models usually require careful tuning of model parameters, and typically perform worse on infrequent word senses. This paper presents a novel approach which addresses these limitations by first initializing the word sense embeddings through learning...


Enriching WordNet Ontology using Coarse-Grained Word Senses

All the technologies that have emerged from the vision of the Semantic Web are helpful for knowledge applications in various research areas. The Semantic Web will consist of a distributed environment of shared and interoperable ontologies. Since building ontologies from scratch is difficult and time-consuming, there is another perspective, which studies the approaches for developing a...


Clustering Paraphrases by Word Sense

Automatically generated databases of English paraphrases have the drawback that they return a single list of paraphrases for an input word or phrase. This means that all senses of polysemous words are grouped together, unlike WordNet which partitions different senses into separate synsets. We present a new method for clustering paraphrases by word sense, and apply it to the Paraphrase Database ...


Clustering WordNet Senses Utilizing Modified and Novel Similarity Metrics CS 229 Final Project Report

Introduction We approach the problem of clustering senses in Princeton's WordNet (Fellbaum 1998), a manually created dictionary/thesaurus which attempts to model the structure underlying human concepts. A synset, the fundamental unit in WordNet, is represented by a group of synonyms and a gloss definition, and is connected through a variety of semantic links, such as hypernyms (type-of) or mero...



Publication date: 2003